PARADE: Passage Representation Aggregation for Document Reranking

Abstract

Pre-trained transformer models, such as BERT and T5, have been shown to be highly effective at ad-hoc passage and document ranking. Due to the inherent sequence length limits of these models, they need to process document passages one at a time rather than processing the entire document at once. Although several approaches for aggregating passage-level signals into a document-level relevance score have been proposed, there has yet to be an extensive comparison of these techniques. In this work, we explore strategies for aggregating relevance signals from a document’s passages into a final ranking score. We find that passage representation aggregation techniques can significantly improve over score aggregation techniques proposed in prior work, such as taking the maximum passage score. We call this new approach PARADE. In particular, PARADE can significantly improve results on collections with broad information needs where relevance signals can be spread throughout the document (such as TREC Robust04 and GOV2). Meanwhile, less complex aggregation techniques may work better on collections with an information need that can often be pinpointed to a single passage (such as TREC DL and TREC Genomics). We also conduct efficiency analyses and highlight several strategies for improving transformer-based aggregation.
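
The contrast the abstract draws between score aggregation and representation aggregation can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the passage scores and vectors are hand-written stand-ins for what a transformer would produce, and mean pooling followed by a linear scorer is an assumed, deliberately simple form of representation aggregation.

```python
from typing import List, Sequence


def split_into_passages(tokens: Sequence[str], size: int, stride: int) -> List[List[str]]:
    """Slide a fixed-size window over the tokens; windows overlap when stride < size."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    passages = []
    start = 0
    while True:
        passages.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # the current window already reaches the end of the document
        start += stride
    return passages


def max_score_aggregation(passage_scores: Sequence[float]) -> float:
    """Score aggregation: the document score is the best single passage score."""
    return max(passage_scores)


def mean_representation_aggregation(
    passage_vectors: Sequence[Sequence[float]],
    weights: Sequence[float],
    bias: float = 0.0,
) -> float:
    """Representation aggregation (assumed form): average the passage vectors,
    then score the pooled vector with a linear layer."""
    n = len(passage_vectors)
    dim = len(weights)
    pooled = [sum(vec[i] for vec in passage_vectors) / n for i in range(dim)]
    return sum(w * x for w, x in zip(weights, pooled)) + bias
```

With max-score aggregation only the strongest passage influences the document score, whereas pooling representations lets every passage contribute, which matches the abstract's observation that the latter helps when relevance signals are spread throughout a document.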

Similar resources

Learning Adaptable Patterns for Passage Reranking

This paper proposes passage reranking models that (i) do not require manual feature engineering and (ii) greatly preserve accuracy, when changing application domain. Their main characteristic is the use of relational semantic structures representing questions and their answer passages. The relations are established using information from automatic classifiers, i.e., question category (QC) and f...

Encoding Semantic Resources in Syntactic Structures for Passage Reranking

In this paper, we propose to use semantic knowledge from Wikipedia and large-scale structured knowledge datasets available as Linked Open Data (LOD) for the answer passage reranking task. We represent question and candidate answer passages with pairs of shallow syntactic/semantic trees, whose constituents are connected using LOD. The trees are processed by SVMs and tree kernels, which can automa...

Multi-Document Summarization via Discriminative Summary Reranking

Existing multi-document summarization systems usually rely on a specific summarization model (i.e., a summarization method with a specific parameter setting) to extract summaries for different document sets with different topics. However, according to our quantitative analysis, none of the existing summarization models can always produce high-quality summaries for different document sets, and e...

Document Expansion for Cross-Lingual Passage Retrieval

This article describes the participation of the joint Elhuyar-IXA group in the ResPubliQA exercise at QA&CLEF 2010. In particular, we participated in the English–English monolingual task and in the Basque–English cross-lingual one. Our focus was threefold: (1) to check to what extent information retrieval (IR) can achieve good results in passage retrieval without question analysis and answer v...

Rich Document Representation for Document Clustering

In traditional document clustering models, a document is considered a bag of words. In this paper we present a new method for generating feature vectors, using sentence fragments called logical terms and statements, in the PLIR system. PLIR is a knowledge-based information system based on the theory of Plausible Reasoning. We have conducted a number of experiments using OHSUMED ...

Journal

Journal title: ACM Transactions on Information Systems

Year: 2023

ISSN: 1558-1152, 1558-2868, 1046-8188, 0734-2047

DOI: https://doi.org/10.1145/3600088